FINGERPRINTING BACTERIAL STRAINS
USING REPETITIVE DNA SEQUENCE AMPLIFICATION 

FIELD OF THE INVENTION 
The present invention relates generally to the use of oligonucleotide
probes directed to repetitive DNA sequence elements to identify bacteria.
More particularly, it relates to the use of these probes as primers for the 
amplification of bacterial genomic DNA between repetitive sequences, and 
the use of these amplification products to construct DNA fingerprints 
unique to the probed genome. It also relates to the disclosure of specific 
primers which are useful as oligonucleotide probes in the practice of this 
invention. 

BACKGROUND OF THE INVENTION 
Interspersed repetitive DNA sequence elements have been
characterized extensively in eukaryotes, although their function
remains largely unknown. The conserved nature and interspersed
distribution of these repetitive sequences have been exploited to amplify
unique sequences between repetitive sequences by the polymerase chain
reaction. Additionally, species-specific repetitive DNA elements have been
used to differentiate between closely related murine species.

Prokaryotic genomes are much smaller than the genomes of
mammalian species (approximately 10^6 versus 10^9 base pairs of DNA,
respectively). Since these smaller prokaryotic genomes are maintained
through selective pressures for rapid DNA replication and cell
reproduction, the non-coding repetitive DNA should be kept to a minimum
unless maintained by other selective forces. For the most part,
prokaryotes have a high density of transcribed sequences. Nevertheless,
families of short intergenic repeated sequences occur in bacteria.



WO 93/08297
PCT/US92/09230



The presence of repetitive sequences has been demonstrated in
many different bacterial species. Reports of novel repeated sequences in
the eubacterial genera Escherichia, Salmonella, Deinococcus, Calothrix
and Neisseria, and in the fungi Candida albicans and Pneumocystis carinii,
illustrate the presence of dispersed extragenic repetitive sequences in
many organisms. One such family of repetitive DNA sequences in
eubacteria is the Repetitive Extragenic Palindromic (REP) elements. The
consensus REP sequence for this family includes a 38-mer sequence
containing six totally degenerate positions, including a 5 bp variable loop
between each side of the conserved stem of the palindrome. Another
family of repetitive elements is the Enterobacterial Repetitive Intergenic
Consensus (ERIC) sequences. ERIC is larger (the consensus sequence is a
126-mer) and contains a highly conserved central inverted repeat. The ERIC
and REP consensus sequences do not appear to be related.

Previous studies have used repeated rRNA genes as probes in
Southern blots to detect restriction fragment length polymorphisms
(RFLPs) between strains. Repeated tRNA genes have been used as
consensus primer binding sites to directly amplify DNA fragments of
different sizes by PCR amplification of different strains. Limitations of
both techniques include the use of radioisotopes and time-intensive
methods such as Southern blotting and polyacrylamide gel electrophoresis
to clearly distinguish subtle differences in the sizes of the DNA fragments
generated. The latter technique could only distinguish organisms at the
species and genus level: the tDNA-PCR fingerprints are generally
invariant between strains of a given species and between related species.
Other previous studies include the use of species-specific repetitive DNA
elements as primer-binding sites for PCR-based bacterial species
identification. Though such methods allow species identification by PCR
with picogram amounts of DNA, only single PCR products are generated,
which precludes the generation of strain-specific genomic fingerprints.

Although these previous studies demonstrated that species-specific
repetitive DNA elements can be used as primer-binding sites for PCR-based
bacterial species identification, these methods only generated single
PCR products in a single species. The present invention provides a novel
approach to using extragenic repetitive sequences to directly fingerprint
bacterial genomes. Analysis of the amplification products obtained by
amplifying unique sequences between primers to bacterial DNA repeat
sequences reveals unique distances between repeat sequences. This
pattern of distances uniquely fingerprints different bacterial species and
strains. Thus, this approach provides a quick and reliable method to type
bacteria by genomic fingerprinting.

SUMMARY OF THE INVENTION 
An object of the present invention is a method of identifying a
strain of bacteria by amplifying the DNA between repetitive DNA
sequences and measuring the pattern of sized extension products.

An additional object of the present invention is the provision of primer
pairs to bacterial repetitive DNA sequences.

A further object of the present invention is a method of identifying
a strain of bacteria in samples from physiological and non-physiological
sources.

An additional object of the present invention is a method for
diagnosing bacterial disease in humans and animals.

A further object of the present invention is the detection of bacterial
disease or contamination in plants.

An additional object of the present invention is the monitoring of
bacterial contamination in foods.

A further object of the present invention is a method for developing
a library of fingerprints to identify specific strains of bacteria.

An additional object of the present invention is a method for
monitoring bacterial contamination in soils, liquids, solids and other
samples from environmental sources.

A further object of the present invention is a method for monitoring
manufacturing processes for bacterial contamination.

An additional object of the present invention is a method for quality
assurance or quality control of microbiology-based laboratory assays.

A further object of the present invention is a method for genomic
mapping.

An additional object of the present invention is the monitoring of
bacterial populations in bioremediation sites.

A further object of the present invention is the monitoring of
bacterial infections.

An additional object of the present invention is a method for the
automated identification of a bacterial strain.

A further object of the present invention is a machine for the
automated identification of bacterial strains.

Thus, in accomplishing the foregoing objects, there is provided in
accordance with one aspect of the present invention a method for
identifying a strain of bacteria in a sample, comprising the steps of:
amplifying DNA between repetitive sequences in the bacteria by adding a
pair of outwardly-directed primers to the sample, which primers are
capable of hybridizing to repetitive DNA sequences in the bacterial DNA,
and extending outwardly from one hybridizable repetitive sequence to
another hybridizable repetitive sequence; separating the extension
products generated in the amplification step by size; and determining the
specific strain of bacteria by measuring the pattern of sized extension
products.

In specific embodiments of the present invention the primers are
between about 10 and 29 mer and preferably between about 15 and 25 mer.






The primers can be specific to any repetitive sequence but in the
preferred embodiments are specific to ERIC, REP, Ngrep or Drrep.

In various aspects of the present invention the method can be used
for: (1) diagnosis of bacterial disease in plants, animals and humans; (2)
monitoring for bacterial content and/or contamination in the environment;
(3) monitoring food for bacterial contamination; (4) monitoring
manufacturing processes for bacterial contamination; (5) monitoring
quality assurance/quality control of laboratory tests involving
microbiological assays; (6) tracing bacterial contamination and/or
outbreaks of bacterial infections; (7) genome mapping; (8) monitoring
bioremediation sites; and (9) monitoring agricultural sites for test crops,
bacteria and recombinant molecules.

The method is useful on pure or isolated cultures as well as actual
samples from the test site. In a preferred embodiment multiple primers
to different repetitive DNA sequences can be used.

Because of the simplicity of the test it can also be automated for
rapid assay of samples.

A further aspect of the present invention is a machine for
automating the identification of bacterial strains.

Other and further objects, features and advantages will be apparent
from the following description of the presently preferred embodiments of
the invention, which are given for the purpose of disclosure, when taken
in conjunction with the accompanying drawings.

DESCRIPTION OF THE DRAWINGS 
Figure 1 is a schematic showing the binding of outwardly-directed primers.

Figure 2 shows the alignment of various REP oligonucleotide primer
sequences with respect to a REP consensus sequence.

Figure 3 shows the alignment of ERIC oligonucleotide primer sequences
with respect to the central inverted repeat of an ERIC consensus
sequence.

Figure 4 shows PCR amplification of E. coli strain W3110 genomic DNA
with different REP and ERIC oligonucleotide primer sets.

Figure 5 is a 1% agarose gel demonstrating the specificity of ERIC
oligonucleotide primer/template interactions.

Figure 6 shows the results of REP-PCR of strains within the Gram-negative
enterobacterial species.

Figure 7 shows the results of ERIC-PCR of strains within the Gram-negative
enterobacterial species.

Figure 8 shows a "bug blot" hybridization of REP in a wide variety of
bacteria.

Figure 9 shows the evolutionary conservation of REP sequences.

Figure 10 shows a "bug blot" hybridization of ERIC in a wide variety of
bacteria.

Figure 11 shows the evolutionary conservation of ERIC sequences.

Figure 12 shows the evolutionary conservation of Ngrep.

Figure 13 shows REP/ERIC fingerprinting of B. subtilis.

Figure 14 is REP-PCR of E. coli W3110 genomic cosmid library.

Figure 15 is REP-PCR of E. coli W3110 genomic cosmid library.

Drawings are not necessarily to scale. Certain features of the
invention may be exaggerated in scale or shown in schematic form in the
interest of clarity and conciseness.

DETAILED DESCRIPTION OF THE INVENTION 
It will be readily apparent to one skilled in the art that various
substitutions and modifications may be made to the invention disclosed
herein without departing from the scope and spirit of the invention.

DNA amplification as used herein refers to any process which
increases the number of copies of a specific DNA sequence. A variety of
processes are known. One of the most commonly used is the Polymerase
Chain Reaction (PCR) process of Mullis as described in U.S. Patent Nos.
4,683,195 and 4,683,202, both issued on July 28, 1987. In general the PCR
amplification process involves an enzymatic chain reaction for preparing
exponential quantities of a specific nucleic acid sequence. It requires a
small amount of a sequence to initiate the chain reaction and
oligonucleotide primers which will hybridize to the sequence. In PCR the
primers are annealed to denatured nucleic acid followed by extension with
an inducing agent (enzyme) and nucleotides. This results in newly
synthesized extension products. Since these newly synthesized sequences
become templates for the primers, repeated cycles of denaturing, primer
annealing and extension result in exponential accumulation of the
specific sequence being amplified. The extension product of the chain
reaction will be a discrete nucleic acid duplex with termini corresponding
to the ends of the specific primers employed. In the present invention the
extension product traverses from one repetitive sequence to another
repetitive sequence. Since the repetitive sequences are interspersed
throughout the genome at different distances from each other, there will
be exponential growth of products of all the different sizes. The pattern of
extension products of different sizes provides a specific fingerprint for
each bacterium.
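
As a rough illustration of why interspersed repeats give a strain-specific band pattern, the following Python sketch lists the product sizes expected between neighbouring repeat sites. The repeat coordinates, the 5,000 bp amplification limit and the function name are illustrative assumptions and are not taken from this disclosure.

```python
def amplicon_sizes(repeat_positions, max_product=5000):
    """Return sorted sizes (bp) of products spanning adjacent repeat sites.

    repeat_positions: genome coordinates (bp) of repeat elements that bind
    the outwardly-directed primers. Gaps longer than max_product are assumed
    (for this sketch only) to be too long to amplify and are dropped.
    """
    sites = sorted(repeat_positions)
    gaps = [b - a for a, b in zip(sites, sites[1:])]
    return sorted(g for g in gaps if 0 < g <= max_product)


# Two hypothetical strains whose repeats sit at different coordinates.
strain_a = [1000, 1800, 4300, 12000, 12650]
strain_b = [1000, 2100, 4300, 12000, 13900]

print(amplicon_sizes(strain_a))  # [650, 800, 2500]
print(amplicon_sizes(strain_b))  # [1100, 1900, 2200]
```

The two made-up strains share most repeat sites, yet the shifted sites alone are enough to change the set of product sizes.
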

The term "oligonucleotide primer" as used herein defines a molecule
comprised of more than three deoxyribonucleotides or oligonucleotides.
Its exact length will depend on many factors relating to the ultimate
function and use of the oligonucleotide primer, including temperature,
source of the primer and use of the method. The oligonucleotide primer
can occur naturally (as a purified fragment or restriction digestion
product) or be produced synthetically. The oligonucleotide primer is
capable of acting as an initiation point for synthesis when placed under
conditions which induce synthesis of a primer extension product
complementary to a nucleic acid strand. The conditions can include the
presence of nucleotides and an inducing agent such as a DNA polymerase
at a suitable temperature and pH. In the preferred embodiment the
primer is a single-stranded oligodeoxyribonucleotide of sufficient length to
prime the synthesis of an extension product from a specific sequence in
the presence of an inducing agent. In the present application the
oligonucleotides are usually between about 10 mer and 29 mer. In the
preferred embodiment they are between 15 and 25 mer. Sensitivity and
specificity of the oligonucleotide primers are
determined by the primer length and uniqueness of sequence within a
given sample of a template DNA. Primers which are too short, for
example less than 10 mer, may show non-specific binding to a wide variety
of sequences in the genomic DNA and thus are not very helpful.
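
The effect of primer length on non-specific binding can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a random genome with equal base frequencies, in which an exact n-mer is expected about 2G/4^n times in G base pairs when both strands are counted. The 4,600,000 bp genome size (roughly that of E. coli) and the function name are assumptions made for illustration only.

```python
def expected_chance_sites(primer_length, genome_bp):
    """Expected exact matches to one primer on both strands of a random genome.

    Simplifying assumptions: equal base frequencies, independent positions,
    and perfect matches only (real primers also tolerate some mismatches).
    """
    return 2 * genome_bp / 4 ** primer_length


for n in (8, 10, 15, 20):
    print(n, expected_chance_sites(n, 4_600_000))
```

Under these assumptions a 10-mer is expected to match by chance at between eight and nine sites in such a genome, whereas a 15-mer is expected to match fewer than 0.01 times, which is consistent with the preference for primers of 15 to 25 mer.
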

Each primer pair herein is selected to be substantially
complementary to the different strands of each specific repetitive sequence
to which the primer pairs bind. Thus one primer of each pair is
sufficiently complementary to hybridize with a part of the sequence in the
sense strand and the other primer of each pair is sufficiently
complementary to hybridize with a different part of the same repetitive
sequence in the anti-sense strand.

It should also be recognized that a single primer can be considered
a primer pair in this invention. Because the primer binds to repetitive
sequences and because the repetitive sequences can be oriented in both
directions, a single primer can bind to both strands of a repetitive
sequence and amplify the sequence between two separate repetitive
sequences.
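
The single-primer case can be sketched in Python as follows. The model is a deliberate simplification: a repeat in "+" orientation is assumed to let the primer extend toward higher coordinates, an inverted ("-") repeat toward lower coordinates, and only neighbouring sites are considered. The coordinates, the size limit and the function name are hypothetical.

```python
def single_primer_products(sites, max_product=5000):
    """Sizes (bp) of products from one primer binding repeats in both orientations.

    sites: list of (position_bp, orientation), orientation being '+' or '-'.
    A product is counted only where a '+' site is directly followed by a '-'
    site, so that the two extensions converge on the unique DNA between them.
    """
    ordered = sorted(sites)
    sizes = []
    for (pos_a, ori_a), (pos_b, ori_b) in zip(ordered, ordered[1:]):
        if ori_a == "+" and ori_b == "-" and pos_b - pos_a <= max_product:
            sizes.append(pos_b - pos_a)
    return sorted(sizes)


sites = [(500, "+"), (1700, "-"), (3000, "-"), (3900, "+"), (4400, "-")]
print(single_primer_products(sites))  # [500, 1200]
```
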

As used herein the term "outwardly directed" primer pair refers to
the oligonucleotide primers and their binding as seen in Figure 1. In the
present application one primer is substantially complementary to the sense
strand. This primer binds to the sense strand in such an orientation that
the extension product generated from the 3' end of the primer extends
away from the repetitive DNA sequence to which the oligonucleotide
primer is bound and across the non-repetitive DNA to a second repetitive
DNA sequence. The other member of the primer pair binds to the anti-sense
strand. This primer binds in an orientation such that the extension
product generated from the 3' end extends away from the repetitive DNA
sequence to which the primer is bound and across the non-repetitive DNA
to the next repetitive DNA sequence. Thus, within a specific repetitive
DNA sequence the primer pair is bound to the complementary DNA
strands 5' to 5' (see Figure 1) and the extension products grow
away from each other across the non-repetitive DNA. The extension
products from the two paired primers are complementary to each other
and can serve as templates for further synthesis by binding the other
member of the primer pair.
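
The orientation described above can be illustrated with a short Python sketch that derives an outwardly-directed primer pair from a repeat written 5' to 3' on one strand. The 24-mer used here is a made-up sequence, not a REP or ERIC sequence from this disclosure, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def outward_primers(repeat, length):
    """Return (leftward, rightward) primers for a repeat given 5'->3'.

    rightward copies the 3' end of the repeat, so it anneals to the opposite
    strand and extends past the right edge of the repeat.
    leftward is the reverse complement of the 5' end, so it anneals to the
    given strand and extends past the left edge of the repeat.
    """
    if not 0 < length <= len(repeat):
        raise ValueError("primer length must fit inside the repeat")
    rightward = repeat[-length:]
    leftward = reverse_complement(repeat[:length])
    return leftward, rightward


toy_repeat = "ACGGTTCAGGCTAATCCGGATTGC"  # hypothetical repeat, illustration only
print(outward_primers(toy_repeat, 8))  # ('TGAACCGT', 'CGGATTGC')
```

With this construction the 5' ends of the two primers face each other inside the repeat and the 3' ends point outward, matching the 5' to 5' arrangement of Figure 1.
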

As used herein the term "extension product" refers to the nucleotide
sequence which is synthesized in the presence of nucleotides and an
inducing agent such as a polymerase from the 3' end of the
oligonucleotide primer and which is complementary to the strand to which
the oligonucleotide primer is bound.

As used herein the term "differentially labelled" shall indicate that
the extension product can be distinguished from all the others because it
has a different label attached or is of a different size or binds to a specific
oligonucleotide or a combination thereof. One skilled in the art will
recognize that a variety of labels are available. For example, these can
include radioisotopes, fluorescers, chemiluminescers, enzymes and
antibodies. Various factors affect the choice of the label. These include
the effect of the label on the rate of hybridization and binding of the
primer to the DNA, the sensitivity of the label, the ease in making the
labeled primer, probe or extension products, the ability to automate,
available instrumentation, convenience and the like. For example, in one
embodiment of the present invention size alone is sufficient to distinguish
the patterns and thus no other label is needed. The size differences can
be determined after staining the DNA, for example with ethidium
bromide. However, when detecting multiple species in a sample or for
multiple repetitive sequences it may be advantageous to add a radioactive
label such as 14C; a different fluorescer such as fluorescein,
tetramethylrhodamine, Texas Red or 4-chloro-7-nitrobenzo-2-oxa-1-diazole
(NBD); or a mixture of different labels such as radioisotopes, fluorescers
and chemiluminescers.

The term "repetitive DNA" as used herein refers to non-coding 
sequences of DNA containing short repeated sequences and dispersed 
throughout the bacterial genome.

As used herein (1) "REP" refers to the repetitive extragenic
palindromic elements. The REP consensus sequence is shown in SEQ. ID.
NO. 1. (2) "ERIC" refers to the enterobacterial repetitive intergenic
consensus sequence. The ERIC consensus sequence is shown in SEQ. ID.
NO. 36. (3) "Ngrep" refers to the Neisseria repetitive elements. The
Ngrep consensus sequence is shown in SEQ. ID. NO. 46. (4) "Drrep"
refers to the Deinococcus repetitive elements. The Drrep consensus
sequence is shown in SEQ. ID. NO. 60. These repetitive elements are
found interspersed throughout the bacterial genome. These four
sequences or any combination of these four sequences can be used in the
present invention. Further, one skilled in the art will understand that as
new repetitive sequences in bacteria become known they can also be used
in the method of the present invention. By binding outwardly-directed
primers to these repetitive sequences and performing amplification one
can generate unique fingerprints and identify individual strains of
bacteria.

The oligonucleotide primers may be prepared using any suitable
method known in the art, for example the phosphodiester and
phosphotriester methods or automated embodiments thereof. It is also
possible to use a primer which has been isolated from biological sources,
such as with a restriction endonuclease digest.

The repetitive sequence to which the primers bind can be selected
from any of the repetitive regions that are found in bacteria. The
repetitive sequences can be identified by a variety of methods. This may
be done manually by comparing the published nucleic
acid sequences for bacterial genomes. A more convenient method,
however, is to use a computer program to compare the sequences. In this
way one can generate a consensus DNA sequence for use in the methods
of the present application.
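
A minimal Python sketch of such a computer comparison is given here, assuming the repeat copies have already been aligned to equal length. The 75% agreement threshold, the use of "N" for degenerate positions, the toy sequences and the function name are all illustrative assumptions rather than the procedure used to derive the consensus sequences of this disclosure.

```python
from collections import Counter


def consensus(aligned, min_fraction=0.75):
    """Column-wise consensus of equal-length, pre-aligned sequences.

    A column is assigned its most common base when that base is present in
    at least min_fraction of the sequences; otherwise it is written as 'N'.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)


copies = ["GCCTGATGC", "GCCGGATGC", "GCCTGATGC", "GCCAGATGC"]
print(consensus(copies))  # GCCNGATGC
```
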

Any source of bacterial nucleic acid in purified or non-purified form
can be utilized as starting material, provided it contains or is suspected of
containing a bacterial genome of interest. Thus, the bacterial nucleic acids
may be obtained from any source which can be contaminated by bacteria.
When looking for bacterial infection or distinguishing bacteria from
human or animal subjects, the sample to be tested can be selected or
extracted from any bodily sample such as blood, urine, spinal fluid, tissue,
vaginal swab, stool, amniotic fluid or buccal mouthwash.

In other applications the sample can come from a variety of sources.
For example: (1) in horticultural and agricultural testing the sample can
be a plant, fertilizer, soil, liquid or other horticultural or agricultural
product; (2) in food testing the sample can be fresh food or processed food
(for example infant formula, seafood, fresh produce and packaged food);
(3) in environmental testing the sample can be liquid, soil, sewage
treatment samples, sludge or any other sample in the environment which is
considered or suspected of being contaminated by bacteria.

When the sample is a mixture of material, for example blood, soil
or sludge, it can be treated with an appropriate reagent which is
effective to open the cells and expose or separate the strands of nucleic
acids. Although not necessary, this lysing and nucleic acid denaturing
step will allow amplification to occur more readily. Further, if desired, the
bacteria can be cultured prior to analysis and thus a pure sample obtained.
This is not necessary, however.

The inducing agent for polymerization may be any compound or
system which will function to accomplish the synthesis of primer extension
products. Examples of inducing enzymes which have been used for this
purpose include E. coli DNA polymerase I, Klenow fragment of E. coli
DNA polymerase I, T4 DNA polymerase, Taq DNA polymerase, Vent DNA
polymerase and other available DNA polymerases.

As used herein "fingerprinting" refers to the fact that each strain of
bacteria has its characteristic size pattern of extension products which can
be used to identify the bacterial strain. This unique pattern is each
strain's genomic fingerprint.

One embodiment of the present invention includes a method for
identifying a strain of bacteria in a sample, comprising the steps of:
amplifying DNA by adding a pair of outwardly-directed primers to the
sample, wherein the primers are capable of hybridizing to repetitive DNA
sequences in the bacterial DNA and extending outwardly from one
hybridizable repetitive sequence to another hybridizable repetitive
sequence; separating the extension products generated in the amplification
step by size; and determining the specific strain of bacteria by measuring
the pattern of sized extension products.

It will be recognized that the separating step of this embodiment
may be accomplished by any number of techniques and methods which
will separate the extension products by size. Examples include but are not
limited to gel electrophoresis, capillary electrophoresis, chromatography,
pulsed field gel electrophoresis and mass spectrometry. Thus, one skilled
in the art will recognize that the separation of extension products can be
done by a variety of methods. The choice of method will depend on a
number of factors, such as the available laboratory equipment, the amount
of extension product present, the label if any, the dye, the preference of
the party performing the testing, convenience and the like.

Capillary electrophoresis allows the rapid separation of DNA
fragments through tiny polyacrylamide gels in thin capillaries. The chief
advantage is that much larger voltages can be applied and resolution is
enhanced. The process can be automated: once tubes are loaded,
electrophoresis and data acquisition can be automated by direct connection
to a computer. An example is the Model 270A-HT High Throughput
Capillary Electrophoresis System (Applied Biosystems). Instead of bands
on a gel, the DNA fragments are represented by spikes as a function of
time, indicating the presence of molecules of different sizes.
Another advantage is that not only can PCR-generated spike patterns be
quickly obtained with greater resolution of different-sized fragments, but
the intensity of different bands can be accurately quantitated, permitting
even greater resolution.
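
To show how a digitized trace of spikes over time might be reduced to a list of bands, the following Python sketch picks local maxima above a threshold. The sample readings, the threshold and the function name are invented for illustration; the instruments named in this description may process their data quite differently.

```python
def find_peaks(trace, threshold):
    """Return (index, height) for local maxima at or above threshold.

    trace: detector readings sampled at equal time steps. A flat-topped
    spike is reported once, at its first sample.
    """
    peaks = []
    for i in range(1, len(trace) - 1):
        if trace[i] >= threshold and trace[i - 1] < trace[i] >= trace[i + 1]:
            peaks.append((i, trace[i]))
    return peaks


signal = [0, 1, 5, 12, 4, 1, 0, 2, 9, 9, 3, 0, 1, 7, 2]
print(find_peaks(signal, threshold=5))  # [(3, 12), (8, 9), (13, 7)]
```

The index of each peak stands in for migration time, which a size standard would convert to fragment length, and the height gives the intensity that can be quantitated.
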

Non-electrophoresis methods, namely chromatography, can be used
to separate PCR-generated DNA fragments by size. High-Performance
Liquid Chromatography (HPLC) methods can be used to separate DNA
fragments by the use of size-exclusion columns (Series 800 HPLC
Gradient System - BioRad). DNA fragments are represented by spikes as
a function of time and the data is digitized and fed directly to a computer.
Electrophoresis methods, however, are usually preferred because of greater
reliability and resolution.

One skilled in the art will recognize that measurement of the
pattern of sized extension products to determine the specific strain of
bacteria present may also be accomplished by several means: direct
visualization, or automation using a bar code reader, a laser reader, a
digitizer, a photometer, a fluorescence reader or computer planimetry.
The choice of measurement method depends in part on the separation step
and available instrumentation.

A variety of primers can be used to detect repetitive sequences in
bacteria. The primers depend on which repetitive sequence is being
detected and which bacteria are being detected. One skilled in the art can
readily determine which repetitive sequence and primers to use depending
on the bacteria being examined. In an embodiment of the present
invention primer pairs have been selected from the sequences in the
group consisting of SEQ. ID. NOS. 4 to 35, 38 to 45 and 48 to 57. In the
preferred embodiment, when REP is being used for the primer annealing
site, one member of the pair of primers is selected from the group
consisting of SEQ. ID. NOS. 4, 5, 6, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21 and 34 and the other member of the pair is selected from the group
consisting of SEQ. ID. NOS. 7, 8, 9, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33 and 35. In the most preferred embodiment SEQ. ID. NO. 4 and SEQ.
ID. NO. 7 are used.

When the repetitive sequence is ERIC, one member of the pair of
primers is selected from the group consisting of SEQ. ID. NOS. 38, 39, 40
and 41 and the other member of the pair is selected from SEQ. ID. NOS.
42, 43, 44 and 45; in the most preferred embodiment SEQ. ID. NO. 38
and SEQ. ID. NO. 42 are used.

When the repetitive sequence is Ngrep, one member of the pair of
primers is selected from the group consisting of SEQ. ID. NOS. 48, 49, 50
and 51 and the other member of the pair is selected from the group
consisting of SEQ. ID. NOS. 52, 53, 54 and 55. In the preferred
embodiment SEQ. ID. NO. 48 and SEQ. ID. NO. 52 are used.

When the repetitive sequence is Drrep, only a single primer is used.
The primer is either SEQ. ID. NO. 56 or SEQ. ID. NO. 57. This is an example
where a single primer acts as a pair. Use of both primers will not result in
unique fingerprint patterns.

One skilled in the art will readily recognize that as more repetitive
sequences are determined the primer pair which gives the best fingerprint
pattern can be easily selected. For example, a primer to the new sequence
is synthesized and the method of the present invention is run. After
examining the resulting pattern from each primer pair, the primer pair
which best distinguishes the specific test strains can be identified.

In addition to the above described method, a plurality of pairs of
primers can be added to the method. Each pair of primers will bind a
different repetitive sequence. For example, any combination of two or
more of the REP, ERIC, Ngrep and Drrep primer pairs can be added.
Further, the multiprimer assay can be enhanced by differentially labeling
the primer pairs. Thus, after amplification, not only can the overall size
pattern be examined, but the size pattern for each label can be examined. For
example, REP and ERIC oligonucleotide primers can be used, each
labeled with a different fluorescent label. The resultant differential
labeling pattern can be determined by fluorescence scanning. This
procedure can provide finer fingerprint patterns.
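
The idea of reading one size pattern per label can be sketched in Python as a simple grouping step. The label names and product sizes are invented for illustration and the function name is hypothetical.

```python
from collections import defaultdict


def fingerprints_by_label(products):
    """Group (label, size_bp) observations into one sorted size list per label."""
    grouped = defaultdict(list)
    for label, size in products:
        grouped[label].append(size)
    return {label: sorted(sizes) for label, sizes in grouped.items()}


# Hypothetical run with REP products carrying one dye and ERIC products another.
run = [("REP-dye1", 800), ("ERIC-dye2", 1500), ("REP-dye1", 650), ("ERIC-dye2", 420)]
print(fingerprints_by_label(run))
# {'REP-dye1': [650, 800], 'ERIC-dye2': [420, 1500]}
```
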

After electrophoresis in polyacrylamide gels, the gels are scanned
by a laser-based fluorescence detector and the results digitized directly by
a computer connected to the detector. Further, using a Genescanner
(Applied Biosystems) allows the entire process to be automated.

One skilled in the art will readily recognize that this method has
many advantages. It can be readily modified for automated identification
of strains of bacteria. In one embodiment the amplification is done in an
auto-PCR instrument, the extension products are removed and separated
on a sizing gel or by chromatography. The sizing pattern is determined
by an automatic reader and each pattern can be recognized by a computer
means. The computer will store fingerprints of known bacteria for
comparison with test results. In the automated method, bar code readers,
laser readers, digitizers, photometers, fluorescence readers and computer
planimetry can be used to help automate the system.
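
One way a computer could compare a test pattern with stored fingerprints is sketched below in Python. The Dice coefficient, the 5% size tolerance, the greedy band matching and the library contents are assumptions chosen for illustration; this description does not specify a particular comparison algorithm.

```python
def shared_bands(a, b, tolerance=0.05):
    """Count bands in a that pair with an unused band in b within a relative tolerance."""
    remaining = sorted(b)
    shared = 0
    for size in sorted(a):
        for other in remaining:
            if abs(size - other) <= tolerance * max(size, other):
                remaining.remove(other)  # each library band is used at most once
                shared += 1
                break
    return shared


def dice(a, b, tolerance=0.05):
    """Dice coefficient: 2 * shared bands / (bands in a + bands in b)."""
    if not a and not b:
        return 1.0
    return 2 * shared_bands(a, b, tolerance) / (len(a) + len(b))


def best_match(unknown, library):
    """Return (strain_name, score) for the stored pattern most similar to unknown."""
    scored = ((name, dice(unknown, bands)) for name, bands in library.items())
    return max(scored, key=lambda item: item[1])


library = {  # hypothetical stored fingerprints, sizes in bp
    "strain A": [650, 800, 2500],
    "strain B": [1100, 1900, 2200],
}
name, score = best_match([640, 810, 2450, 3100], library)
print(name, round(score, 3))  # strain A 0.857
```

In practice a score threshold would be needed so that a pattern resembling no stored strain is reported as unknown rather than assigned to the nearest entry.
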

In another embodiment of the present invention there is a kit for
determining the identity of strains of bacteria. This kit comprises a
container having a pair of outwardly-directed PCR primers to a repetitive
sequence in bacteria. This kit can have any of the PCR primers selected
from the group consisting of SEQ. ID. NOS. 4 to 35, 38 to 45 and 48 to 57
or a combination thereof. One skilled in the art will readily recognize that
the number and type of primers which are in the kit will depend on the
use of the kit as well as the sequences which are to be detected.

A further embodiment of the present invention is a machine for
identifying a strain of bacteria comprising an automated PCR amplifying
means, a separation means, a sampling means for removing the extension
products from the PCR means and transferring them to the separation
means, a reading means for measuring patterns of extension products
after separation by the separation means, and a computer means for recording
the results of the reading means and for outputting the pattern of and
identifying the strain of bacteria.

A number of automated PCR amplifying means are available on the
market. For instance a thermal cycler can be used. There are a number
of arms or robotic devices and other automatic pipette and sampling
machines which can be used as a sampling means for removing the
extension products from the PCR reaction at the appropriate times and
transferring the sample for chromatography, gel or capillary
electrophoresis, mass spectrometry or other methods or techniques used
to separate the samples. In the preferred embodiment the separator
means is regulated by the computer. After the separation the reader
means is used to measure the pattern. The reader means will depend on
the type of separation which is being used. For instance a wavelength
densitometer reader or a fluorescence reader can be used depending on the
label being detected. A radioisotope detector can be used for radioisotope
labeled primers. In mass spectrometry the ions are detected in the
spectrometer. A gel can be stained and read with a densitometer. The
computer regulates the automated PCR amplification procedure, the
sampling and removal from PCR, the automatic separation and reading of
the samples and can be used to interpret the results and output the data.

The methods, instruments and procedures described herein can be
used for a variety of purposes. Because of the sensitivity and specificity
of the test one skilled in the art will readily recognize uses for this
methodology. What follows is not an inclusive list of uses but only a
sampling of specific areas where a current need exists for a quick and
reliable test.

The methodology of the present invention can be used for
diagnosing bacterial diseases, whether in plants, animals or humans.
Not only can the disease be diagnosed but the specific strain involved can
be identified.

The environment can be monitored for bacterial contamination.
The procedure works on a variety of samples including liquids, sludge,
sewage treatment plant samples and soil. Thus, anywhere that there has
been environmental contamination that needs monitoring, the test will
work.

This procedure should be very useful in the area of monitoring food
contamination. A variety of producers of foods currently test their
products for bacterial contamination. This procedure will help facilitate
this testing. For example, infant formula, seafood, fresh produce and
processed food can all be readily tested by this procedure. This procedure
can also be useful to detect the source of food poisoning.

Another important use of this method is in the monitoring of the
bacterial populations in a bioremediation site. Bioremediation usually
uses specific bacterial populations to destroy the contamination. The
bacteria can be from the natural population growing at the site or bacteria
added to the site to enhance the breakdown. The bacteria used in the
enhancement procedures usually are from cultures and/or sludge. In any
of these instances it is important to monitor the population of bacteria in
the bioremediation site to make sure that the appropriate strain(s) of
bacteria is present and growing. This procedure allows the rapid
identification of the bacteria in the population so that it can be
readily monitored. The test works on samples of soil, liquid, sludge or
other material to be added to the bioremediation site.

In the areas of horticulture and agriculture a variety of uses of this
method are found. One can monitor bacterial inoculations of plants or
bacterial disease of plants. It can also be used to monitor the distribution
of recombinant bacteria added to the environment. Samples can come
from the soil where bacteria have been added, or from fertilizer to make
sure that the fertilizer has the appropriate bacteria. It can be used to
monitor pest control where bacteria are added in order to kill pests such
as insects. This procedure allows quick and accurate monitoring of the
application of the bacterial insecticide and the activity of the insecticide.
Thus, in any horticulture or agriculture procedure which requires the
addition of bacteria the bacteria can be monitored throughout the
procedure.

Another application of this method is in the manufacturing process.
A number of manufacturing processes, for instance drugs, microorganism-
aided synthesis, food manufacturing, chemical manufacturing and
fermentation processes, all rely either on the presence or absence of
bacteria. In either case the method of the present invention can be used.
It can monitor bacterial contamination or test that strain purity is being
maintained.

This method can also be used to test stored blood for bacterial
contamination. This would be important in blood banking where bacteria
such as Yersinia enterocolitica can cause serious infection and death if
present in transfused blood.

The procedure can also be used for quality assurance and quality
control in monitoring bacterial contamination in laboratory tests. For
example, the Guthrie bacterial inhibition assay uses a specific strain of
bacteria to measure phenylalanine in newborn screening. If this strain
changes it could affect test results and thus affect the accuracy of the
newborn screening program. The method of the present invention can be
used to monitor the strain's purity. Any other laboratory test which uses
or relies on bacteria in the assay can be monitored. The laboratory or test
environment can also be monitored for bacterial contamination by
sampling the lab and testing for specific strains of bacteria.

This procedure will also be useful in hospitals for tracing the origin
and distribution of bacterial infections. It can show whether or not the
infection of the patient is a hospital-specific strain. The type of treatment
and specific anti-bacterial agent can depend on the source and nature of
the bacteria.

The following examples are offered by way of example, and are not
intended to limit the scope of the invention in any manner.

Example 1

Isolation and Quantitation of Genomic DNA
1. Genomic DNA from Gram-negative and spirochete bacteria.
Cells were pelleted and washed twice in 1 ml of 1 M NaCl by
centrifugation in a fixed angle microfuge at 15,000 rpm for 5 min. Cells
were washed twice and resuspended in TE (10 mM Tris, 25 mM EDTA, pH
8.0) and incubated in 0.2 mg/ml lysozyme and 0.3 mg/ml RNase A for 20
min at 37° C. If lysis by lysozyme was not visible with refractory
pathogenic strains, 0.6% SDS was added. To these suspensions, 1%
Sarkosyl and 0.6 mg/ml proteinase K were added, and the cells were
incubated for 1 hr at 37° C. Cell lysates were extracted twice with phenol
and twice with chloroform. The aqueous phase was precipitated with
0.33 M NH4 acetate and 2.5 volumes of ethanol. Precipitated threads of
DNA were removed with a sterile Pasteur pipette tip, and dissolved in TE
(10 mM Tris, 1 mM EDTA, pH 8.0).

2. Genomic DNA from Gram-positive bacteria.
Concentrated cell pellets were washed twice in 1 M NaCl and twice
in TE (60 mM Tris, 50 mM EDTA, pH 7.8) and spun in a fixed-angle
microfuge for 5 min. Cell pellets were resuspended in TE and incubated
with 250 U/ml mutanolysin and 0.3 mg/ml RNase A for 30 min at 37° C.
To this reaction, 0.6% SDS and 0.6 mg/ml proteinase K were added, and
the mixture was incubated for 1 hr at 37° C, followed by 65° C for 45 min.
Lysates were extracted twice with phenol and twice with chloroform.
Chromosomal DNA was precipitated and dissolved as described for Gram-
negative bacteria.

In both instances the genomic DNA was quantitated by
spectrofluorimetry at excitation and emission wavelengths of 365 nm and
460 nm respectively using the DNA-specific dye, Hoechst 33258, and a
fluorometer.

Example 2 
Primer Synthesis 
The oligonucleotide primers were synthesized by the
phosphoramidite method using an automated DNA synthesizer, and DNA
sequence information from consensus sequence data. The primers were
labeled by 5' end-labeling of each oligonucleotide as described by Maniatis
et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY (1982). Fifty pmol of each
primer were used with 20 U T4 polynucleotide kinase and 5 µl of
γ-32P-ATP (6000 Ci/mmol). Labeled DNA was separated from unincorporated
isotope by diluting the 50 µl reaction volume to 1.0 ml in millipore water,
followed by centrifugation of this solution through Centricon-3 (Amicon)
tubes. The retained solution contains the hybridization probe.
Oligonucleotides were quantitated by UV-VIS spectrophotometry with
absorption measured at 260 nm.

Example 3 
Primer Design

1. REP oligonucleotide sequences are shown in SEQ. ID. NOS.
1 to 35 and Fig. 2. Degenerate 38-mer REPALL (SEQ. ID. NOS. 2 and 3)
probes were designed which encompassed the entire consensus REP
sequence (SEQ. ID. NO. 1). Other REP oligonucleotide probe pairs, each
representing part of the conserved consensus sequence, were designed
with opposite orientations such that the 3' ends were directed outwards
from each REP sequence. This design constituted a pair of outwardly-
directed primers. If the primer to one side of the sequence is shorter than
the other, inosines may be added to make the lengths of the pair equal.
Total degeneracy is represented either by any one of the four common
bases (A, G, C, or T) placed at specific positions, or inosines placed at
specific positions. Inosine contains the purine base, hypoxanthine, and is
capable of forming Watson-Crick base pairs with A, G, C, or T. Positions
can be partially degenerate with two of these four bases placed at specific
positions as chosen from the consensus REP sequence.

2. ERIC oligonucleotide sequences are shown in SEQ. ID. NOS.
36 to 45 and Fig. 3. The ERICALL oligonucleotide is a 44-mer of SEQ.
ID. NO. 37 from position 42 to 85 and contains the entire conserved
central core inverted repeat. Non-degenerate, consensus ERIC1R (SEQ.
ID. NO. 38) and ERIC2 (SEQ. ID. NO. 42) oligonucleotides were designed
from each half of this core inverted repeat with opposite orientations such
that the 3' ends are directed outwards from the center of the ERIC
element.

3. Ngrep sequences are shown in SEQ. ID. NOS. 46 to 55.
SEQ. ID. NO. 46 is the consensus sequence.

4. Drrep sequences are shown in SEQ. ID. NOS. 56, 57 and 60.
SEQ. ID. NO. 60 is the consensus sequence.

Example 4 
Membrane Supported Hybridization 
A single membrane containing genomic DNA from 39 different
eubacterial species representing 7 of 10 different phyla of eubacteria as
defined by Woese, Microbiol. Rev., 51:221-271 (1987), based on rDNA
sequence comparisons, was named the "bug blot." This bug blot was made
by adding 100 ng of denatured genomic DNA per slot, from each species
listed in Figs. 8 or 10, on Gene Screen Plus (Du Pont) membranes. The
bug blot represents a slot blot DNA:DNA hybridization of genomic DNA
probed with 32P-end-labeled SEQ. ID. NO. 3 (Fig. 8) or SEQ. ID. NO. 42
(Fig. 10). These membranes were pretreated as described in Maniatis
et al. Genomic DNAs were denatured in solution by heating to 100° C for
5 min. DNA samples were then applied to the membrane, and 500 µl 0.4 N
NaOH were added to each slot. Membranes were rinsed in 1X SSC, and
blotted dry with Whatman paper. Membranes were baked at 80° C for 1
hr, and stored in sealed plastic bags at -20° C.

The hybridization solution was prepared as described in Noda, A.,
et al., Biotechniques 10:474-477 (1991) for use with oligonucleotide probes
on a membrane containing ordered lambda phages representing the E. coli
W3110 genome. For REP oligonucleotide hybridization, membranes were
prehybridized at 42° C for 1.5 hrs. The probe was denatured at 100° C for
5 min. Probe was added at 1 x lO^cpm/ml hybridization solution and the
membranes were incubated at 42° C for 15 hrs. ERIC oligonucleotide
prehybridizations and hybridizations were both performed at 65° C. After
incubation the membranes were washed twice at room temperature for 10
min with 2X SSPE and 0.1% SDS, followed by one final wash (REP, 37° C
for 15 min; ERIC, 40° C for 1 min). Autoradiograms were exposed on
Kodak X-Omat film with two intensifying screens at -SS^'C for 24 hrs.

Example 5 
DNA Amplification 
There are a number of DNA amplification methods available in the
art. These generally depend on the use of one or more of a variety of
polymerases to catalyze DNA chain extension (polymerization) from
component nucleotide bases. An example of a DNA amplification
technique is the polymerase chain reaction (PCR) used here to catalyze
the extension of the oligonucleotide primer from the 3' end along the
DNA template to which the primer is hybridized.

Each 25 µl of PCR reaction contained 50 pmol each of 2 opposing
primers, 100 ng of template (genomic) DNA, 1.25 mM of each of 4 dNTPs,
2 U AmpliTaq DNA polymerase (Perkin-Elmer/Cetus) in a buffer with
10% DMSO (v/v). PCR amplifications were performed in an automated
thermal cycler, with an initial denaturation at 95° C for 7 min, followed
by 30 cycles of denaturation at 90° C for 30 sec, followed by annealing
(REP, 40° C for 1 min; ERIC, 52° C for 1 min), and then extension (65° C
for 8 min), with a single final extension (65° C for 16 min). All PCR
reaction tubes were placed in internal rows of the thermal cycler and all
peripheral tubes were surrounded by "dummy" tubes containing water and
mineral oil. Five to eight µl of each PCR reaction volume were then
electrophoresed directly on 1% agarose gels containing 1x TAE (Tris-
acetate-EDTA), and 0.5 µg/ml ethidium bromide. The gels were
photographed with 20 second exposures to Polaroid type 55 film.
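
The cycling conditions and reaction composition given above can be
written down as data and checked arithmetically. The following is a
minimal sketch for illustration only: the step values are taken from the
text, while the data layout and the function names hold_minutes and
micromolar are hypothetical. Ramp times between temperatures are ignored,
so the result is a lower bound on the run time.

```python
# Each step is (label, temperature in degrees C, hold time in minutes).
INITIAL = ("initial denaturation", 95, 7.0)
FINAL = ("final extension", 65, 16.0)
CYCLES = 30
ANNEALING_C = {"REP": 40, "ERIC": 52}


def cycle_steps(primer_family: str):
    """Return the three steps of one cycle for REP or ERIC primers."""
    return [
        ("denaturation", 90, 0.5),
        ("annealing", ANNEALING_C[primer_family], 1.0),
        ("extension", 65, 8.0),
    ]


def hold_minutes(primer_family: str) -> float:
    """Total programmed hold time, excluding temperature ramps."""
    per_cycle = sum(minutes for _, _, minutes in cycle_steps(primer_family))
    return INITIAL[2] + CYCLES * per_cycle + FINAL[2]


def micromolar(pmol: float, volume_ul: float) -> float:
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / volume_ul


if __name__ == "__main__":
    print(hold_minutes("REP"))      # total hold time in minutes
    print(micromolar(50, 25))       # concentration of each primer
    print(100 / 25)                 # template DNA in ng per microliter
```

On these figures each primer is present at 2 µM, the template at 4 ng/µl,
and the programmed hold time is a little over five hours for either
primer family, since only the annealing temperature differs.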

Example 6 
REP Primers 

Genomic DNA from lysed E. coli W3110 cells served as the test
sample. REP1R-I (SEQ. ID. NO. 4) and REP2-I (SEQ. ID. NO. 7)
oligonucleotides were used as the pair of outwardly-directed primers. PCR
amplification was accomplished as described above. Separation of
amplification products was accomplished on 1% agarose - 1x Tris-acetate-
EDTA gel, and the pattern of sized extension products was determined
using ethidium bromide to stain the DNA. No template DNA was added
to the negative control lanes. REP1R-I and REP2-I primers were used in
negative control lane 11. Results are shown in Figure 4. The inosine-
containing outwardly-directed primer pair, REP1R-I and REP2-I, provided
the most distinct genomic fingerprint of E. coli strain W3110 chromosomal
DNA. REP oligonucleotides (Fig. 2) were all tested as primers for DNA
amplification because these outwardly-directed primers can amplify DNA
between successive REP sequences in any orientation. The inosine-
containing primer, however, provided more distinct DNA amplification
band patterns and less smearing, possibly because each primer is
represented by a single primer sequence instead of a pool of multiple
primer sequences as with REP1R-D (SEQ. ID. NO. 5) and REP2-D (SEQ.
ID. NO. 8). Each REP primer alone yielded visible amplification products
of relatively limited complexity. This result probably stems from the fact
that each side of the inverted repeat has a slightly different consensus
sequence. The use of both primers REP1R-I and REP2-I appears to allow
optimal annealing with both sides of the conserved stem of each REP-like
sequence in the genome. Inefficient amplification with REPALL-I (SEQ.
ID. NO. 2) and REPALL-D (SEQ. ID. NO. 3) was observed, presumably
because a palindrome is present in the primer. Because of the possible
self-hybridization between REPALL primers of opposite orientation it was
not possible to design the primers to the complete REP consensus
sequence in both orientations.

Example 7
ERIC Primers

Using genomic DNA from lysed E. coli W3110 cells as the test
sample, ERIC1R (SEQ. ID. NO. 38) and ERIC2 (SEQ. ID. NO. 42)
oligonucleotides were used as the pair of outwardly-directed primers. PCR
amplification was accomplished as described above. Separation of
amplification products was accomplished on 1% agarose - 1x Tris-acetate-
EDTA gel, and the pattern of sized extension products was determined
using ethidium bromide to stain the DNA. No template DNA was added
to the negative control lanes. ERIC1R and ERIC2 primers were used in
negative control lane 15. Results are shown in Figure 4. Amplification
results obtained with the single consensus ERIC primer set, ERIC1R and
ERIC2 (Fig. 4), were matched in complexity by the results obtained with
ERIC2 alone (Fig. 4). In contrast, PCR amplification with ERIC1R alone
yielded limited amplification products (Fig. 4). Two possible reasons for
this observation are that either greater sequence conservation exists in the
side of the inverted repeat complementary to ERIC2 or homologous,
unrelated sequences complementary to ERIC2 exist outside ERIC elements
in the genome.

Example 8 

Specificity of Primer/Template Interactions

PCR reactions using primer binding sites at known distances from
ERIC sequences were performed to verify the size of amplification
products. Specificity of REP primer/template interactions was
demonstrated by amplification between a known REP sequence and a Tn5
insertion in the glpD gene of E. coli. The ERIC primers generated PCR
products of the expected sizes after amplification of Kohara lambda phages
containing the E. coli hsdR locus and an adjacent ERIC sequence. Results
are shown in Fig. 5. The Kohara lambda phages used are listed by clone
numbers, and miniset serial numbers are shown in parentheses. One µl of
each Kohara phage lysate was used as template DNA. PCR conditions
were as described above. Lanes 2-5 represent PCR amplifications with
primers within the hsdR gene, hsdR+2758 (SEQ. ID. NO. 58) and
hsdR-3235R (SEQ. ID. NO. 59). Lanes 7-10 represent PCR products generated
by primers hsdR+2758 and ERIC2. Lane 6 is a blank lane where nothing
was added to the gel. The molecular weight marker is a 1-kb ladder. The
gels were 1% agarose-1x Tris-acetate-EDTA and contained 0.5 µg of
ethidium bromide per ml. The specificity of ERIC-PCR was confirmed by
PCR amplification of a defined DNA segment between a published ERIC
sequence (also known as an IRU sequence) and a sequence within the E. coli
hsdR gene using the ordered Kohara phage library. Single PCR products of
the expected size were amplified both within the hsdR gene and between the
hsdR and ERIC sequences carried by Kohara phages containing the E. coli
hsdR locus. Amplification with only a single hsdR primer failed to yield
any product.

Example 9

Distinguishing Between Strains of Bacteria With REP
REP primers were used to distinguish different strains within
Gram-negative enterobacterial species. Fig. 6 shows extension products
generated by PCR amplification of enterobacterial genomic DNA with the
REP primers, REP1R-I (SEQ. ID. NO. 4) and REP2-I (SEQ. ID. NO. 7).
PCR reactions were performed as described above. No template DNA was
added to the negative control lane. The DNA molecular weight marker is
a 1-kb ladder. The gels were 1% agarose-1x Tris-acetate-EDTA and
contained 0.5 µg of ethidium bromide per ml.

The REP-PCR genomic fingerprint of different strains from several
bacterial species revealed distinct patterns as shown in Fig. 6. PCR
amplification of DNA from multiple strains of different enterobacterial
species using primers REP1R-I and REP2-I demonstrated subspecies or
strain-specific band patterns. Strains within a species could be
unambiguously identified. In lanes 2 and 3 (Fig. 6), E. coli K-12 strains
HB101 and W3110 were distinguished clearly by an extra band of
approximately 400 bp in W3110. Laboratory strains of E. coli K-12 were
related to each other and distinct from the pathogenic strains of E. coli.
Interestingly, the Salmonella typhimurium laboratory strain LT-2 revealed
a close similarity to Salmonella typhi strain 2304. Both of these strains
showed REP-PCR patterns clearly distinct from other pathogenic
Salmonella strains of undetermined species. The two Klebsiella
pneumoniae strains shown were obtained from different sources and
showed different banding patterns. In lanes 14-15 and lanes 20-21
identical strains of pathogenic Salmonella and Enterobacter sakazakii
respectively were represented by identical REP-PCR patterns.

Example 10

Distinguishing Between Strains of Bacteria With ERIC
ERIC primers were used to distinguish different strains within
Gram-negative enterobacterial species. Fig. 7 shows extension products
generated by PCR amplification of enterobacterial genomic DNA with the
ERIC oligonucleotide primers, ERIC1R (SEQ. ID. NO. 38) and ERIC2
(SEQ. ID. NO. 42) (Fig. 2). PCR reactions were performed as described
above. No template DNA was added to the negative control lane. The
DNA molecular weight marker is a 1-kb ladder. The gels were 1%
agarose-1x Tris-acetate-EDTA and contained 0.5 µg of ethidium bromide
per ml.

The ERIC-PCR genomic fingerprint of different strains from several
bacterial species revealed distinct patterns as shown in Fig. 7. PCR
amplification of DNA from multiple strains of different enterobacterial
species using primers ERIC1R and ERIC2 demonstrated species-specific
band patterns. The complexity, however, was less than that obtained with
REP-PCR (Fig. 6) and the differences between species were easier to
distinguish. This decrease in complexity of the genomic fingerprints,
however, made it more difficult to make fine distinctions between strains,
for example E. coli laboratory strains HB101 and W3110. Greater ERIC-
PCR pattern differences existed when comparing laboratory strains of E.
coli to pathogenic strains of the same species than between laboratory E.
coli and pathogenic Shigella. The ERIC-PCR patterns of greatest
complexity were observed with Salmonella and these results are consistent
with previous database searches revealing an abundance of ERIC in
Salmonella. Because both REP and ERIC PCR yielded common bands
between the strains of a given species, they provide the ability to group
strains within a certain species.

Example 11 
Evolutionary Conservation of REP Sequences 
Figs. 8 and 9 show the use of REP primers to demonstrate the
evolutionary conservation of REP sequences. In Fig. 8 is a listing of
bacterial and non-bacterial species which match the genomic DNA in each
slot of the bug blot hybridization presented in Fig. 8. The bug blot
represents a slot blot DNA:DNA hybridization of genomic DNA probed
with 32P-end-labeled REPALL-D (SEQ. ID. NO. 3). Filters were prepared
and hybridizations were performed as described above. Fig. 9 shows two
gels of PCR amplification products of bacterial genomic DNAs used in the
bug blot hybridization with REP primers, REP1R-I and REP2-I. These
PCR reactions are presented in exactly the same order as the slots of the
bug blot. All PCR reactions were performed as described above. No
template DNA was added to the negative control lane. The DNA
molecular weight marker is a 1-kb ladder. The gels were 1% agarose-1x
Tris-acetate-EDTA and contained 0.5 µg of ethidium bromide per ml.

Slot blot hybridization of the bug blot with SEQ. ID. NO. 3 (Fig. 8)
indicates that Gram-negative enterics and related species from the same
phyla comprise the majority of REP-positive species. Hybridizations with
REPALL-I and REP2-I yielded results similar to the hybridization with
REPALL-D. The 38-mer REPALL probes were used for hybridization
because the increased length provides a longer homologous stretch and
hence greater stability for hybridization. As expected, several species of
Gram-positive bacteria and spirochetes in addition to the phylogenetically
distant eukaryotic fungi failed to yield hybridization signals. Surprisingly,
however, hybridization signals were observed with the distantly related
radioresistant bacterium Deinococcus radiophilus, the green non-sulfur
bacterium Herpetosiphon giganteus, and the archaebacterium
Halobacterium halobium.

PCR amplification of these same bacterial species with primers
REP1R-I and REP2-I yielded results consistent with the bug blot
hybridization described above. The species that showed the most intense
hybridization signals in Fig. 8 generally demonstrated the most complex
amplification patterns by REP-PCR (Fig. 9). PCR amplification of
genomic DNA from different species clearly revealed species-specific REP-
PCR patterns (Fig. 9).

Example 12

Evolutionary Conservation of ERIC Sequences
Figs. 10 and 11 show the use of ERIC primers to demonstrate the
evolutionary conservation of ERIC sequences. In Fig. 10 is a listing of
bacterial and non-bacterial species which match the genomic DNA in each
slot of the bug blot hybridization presented in Fig. 10. The bug blot
represents a slot blot DNA:DNA hybridization of genomic DNA probed
with 32P-end-labeled ERIC2. Filters were prepared and hybridizations
were performed as described above. Fig. 11 shows two gels of PCR
amplification products from bacterial genomic DNAs used in the bug blot
hybridization with the ERIC primers, ERIC1R and ERIC2. These PCR
reactions are presented in exactly the same order as the slots of the bug
blot. All PCR reactions were performed as described above. No template
DNA was added to the negative control lane. The DNA molecular weight
marker is a 1-kb ladder. The gels were 1% agarose-1x Tris-acetate-EDTA
and contained 0.5 µg of ethidium bromide per ml.

The ERIC primers showed similarity of hybridization and PCR
amplification. It should be noted that hybridization with ERICALL
yielded results consistent with hybridization with ERIC2. Gram-negative
enterics and related species from the same family comprised a majority of
ERIC-positive species and, as suspected, several species of Gram-positive
bacteria and spirochetes in addition to the fungi failed to yield
hybridization signals. As with REP (Ex. 11), the radioresistant bacterium,
the green non-sulfur bacterium and the archaebacterium yielded
hybridization signals.

ERIC-PCR also provided results (Fig. 11) consistent with ERIC
hybridization of the bug blot (Fig. 10). Gram-negative enteric species
yielded the amplification patterns of greatest complexity (Fig. 11). Most
Gram-positive species (e.g. Bacillus subtilis) showed minimal ERIC-PCR
amplification (Fig. 11). This result is consistent with computer searches
of ERIC in DNA sequence databases and known phylogenetic distances
between Gram-positive bacteria and Gram-negative enteric bacteria.

Example 13 
Bacterial DNA Fingerprint Library 
The method described above was used to screen a plurality of
different bacterial strains. The pattern for each strain was categorized
and stored. This comprehensive library of fingerprints was used to
compare with unknown samples to determine the strain identity.
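
One way such a library comparison might be carried out by computer is
sketched below. This is an illustrative assumption, not the procedure
disclosed here: a fingerprint is reduced to a list of band sizes in base
pairs, two bands are treated as the same if their sizes agree within a
relative tolerance, and patterns are scored with the Dice coefficient. The
strain names, band sizes, tolerance and threshold are all hypothetical.

```python
def dice(bands_a, bands_b, tol=0.05):
    """Dice similarity of two band-size lists (bp).

    Bands match when their sizes differ by at most `tol` as a fraction of
    the larger band; each band is matched at most once.
    """
    remaining = sorted(bands_b)
    shared = 0
    for a in sorted(bands_a):
        for i, b in enumerate(remaining):
            if abs(a - b) <= tol * max(a, b):
                shared += 1
                del remaining[i]
                break
    total = len(bands_a) + len(bands_b)
    return 2 * shared / total if total else 0.0


def best_match(unknown, library, min_score=0.8):
    """Return (strain, score) for the closest library entry, or (None, score)
    when no entry reaches `min_score`. The library must not be empty."""
    name, score = max(
        ((strain, dice(unknown, bands)) for strain, bands in library.items()),
        key=lambda item: item[1],
    )
    return (name, score) if score >= min_score else (None, score)


if __name__ == "__main__":
    # Hypothetical library: strain_A differs from strain_B by one extra band.
    library = {
        "strain_A": [400, 800, 1500, 2200],
        "strain_B": [800, 1500, 2200],
    }
    unknown = [405, 790, 1510, 2180]
    print(best_match(unknown, library))
```

In this toy library the two entries differ by a single band near 400 bp,
which mirrors the way HB101 and W3110 were told apart in Example 9. A
stored gel image would need band calling and size calibration against the
1-kb ladder before a comparison of this kind could be applied.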



Example 14 

Whole Cell PCR

Gram-negative bacterial colonies are picked with disposable loops,
and the cells on the edge of the loops are placed directly into PCR tubes
containing PCR reaction buffer. The repetitive sequence oligonucleotide
primers are then added with dNTPs and DNA polymerase, and the PCR
reactions are carried out. Presumably during the initial denaturation step
at 94° C the cells lyse and the chromosomal DNA released into solution
serves as the template for PCR amplification. Thus DNA isolation and
purification prior to PCR amplification is not always necessary.

Example 15 
Genome Mapping

REP-PCR was performed on purified cosmid DNA from the ordered
Tabata cosmid library. Tabata, et al., J. Bacteriol., 171:1214-1218 (1989).
This library covers approximately 70% of the E. coli strain W3110
genome. This library represents a set of overlapping or isolated cosmids
which contain genomic DNA from different locations on the E. coli
chromosome. Each individual cosmid DNA is purified and used as
template DNA in the PCR reaction. 100 ng of each cosmid DNA
(represented by serial numbers in Figs. 14-15) is used as template in REP-
PCR with primers REP1R-I and REP2-I (50 pmol of each primer). The
PCR products are then electrophoresed in 1% agarose, 1x TAE, and
stained with 0.5 micrograms per ml ethidium bromide.

As is evident from Figs. 14-15, the different cosmids have different
REP-PCR fingerprints, depending on which segment of the genome is
inserted into a particular clone. By matching fingerprint patterns from
individual clones with the computer, contiguous (contigs) and ordered
stretches of overlapping clones are built.
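
The idea of building contigs from matching fingerprints can be sketched as
a grouping problem. The code below is a simplified illustration under
stated assumptions, not the computer method referred to above: each clone
is represented by a list of band sizes in base pairs, two clones are
assumed to overlap when they share at least a minimum number of bands
within a size tolerance, and overlapping clones are merged with a
union-find structure. Clone names, band sizes and thresholds are
hypothetical, and the sketch groups clones without ordering them.

```python
def shared_bands(a, b, tol=0.05):
    """Count bands in `a` that have a size match in `b` within `tol`."""
    return sum(
        1 for x in a if any(abs(x - y) <= tol * max(x, y) for y in b)
    )


def build_contigs(clones, min_shared=2):
    """Group clones whose fingerprints share at least `min_shared` bands.

    `clones` maps clone name -> list of band sizes (bp). Returns a sorted
    list of contigs, each a sorted list of clone names.
    """
    parent = {name: name for name in clones}

    def find(x):
        # Walk to the set representative, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = sorted(clones)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if shared_bands(clones[a], clones[b]) >= min_shared:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


if __name__ == "__main__":
    clones = {
        "c1": [500, 900, 1400],
        "c2": [900, 1400, 2100],
        "c3": [1405, 2100, 3000],
        "c4": [650, 4000],
    }
    print(build_contigs(clones))
```

Here c1 and c3 share only one band directly but fall into the same contig
through c2, while c4 remains isolated. Ordering the clones within a contig
would require additional information, such as which bands are shared by
which neighbours.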

Further, this fingerprinting method provides a useful tool for
checking the integrity of the library and the purity of each clone. One
skilled in the art will readily recognize that these libraries can be made
from cosmid, phage, or any possible DNA (even RNA) vector.

Example 16
Fingerprinting Bacterial
Strains Used in Newborn Screening
The techniques of the present invention were used to monitor the
validity of bacterial strains used in newborn screening. REP-PCR, ERIC-
PCR, and combined REP/ERIC-PCR were performed on Bacillus subtilis
strains, ATCC 6633 and 6051, which are used for newborn screening of
phenylketonuria (PKU) and maple-syrup urine disease (MSUD)
respectively. In Fig. 13, REP1R-I, REP2-I, ERIC1R, and ERIC2 (50
pmol of each primer) were used in single PCR reactions (REP-ERIC-
PCR) on individual samples of Bacillus subtilis genomic DNA. One strain
that was supposedly ATCC 6633 turned out to be an anomalous strain
(lane 12) that was clearly distinct from the others. The strain used for
MSUD diagnosis, ATCC 6051, was distinguished from the strain used for
PKU diagnosis, ATCC 6633. No template DNA was added to the negative
control lane. PCR products were electrophoresed on 1% agarose gels in
1x TAE and stained with 0.5 micrograms per ml ethidium bromide. It
should also be noted that this example shows the combination of two sets
of different primers and their simultaneous use in identifying and
fingerprinting strains of bacteria.

Example 17 
Ngrep 

In Fig. 12 are the results of PCR using Ngrep primers (SEQ. ID.
NOS. 48 and 52). As can be readily seen, the different strains of Neisseria
can be distinguished. The conditions are as described in previous
examples except the denaturing and annealing steps occur at 94° C for 1
min. and 38° C for 1 min., respectively. The negative control has no DNA
template but includes primers.

All patents and publications mentioned in this specification are
indicative of the levels of those skilled in the art to which the invention
pertains. All patents and publications are herein incorporated by
reference to the same extent as if each individual publication was
specifically and individually indicated to be incorporated by reference.

One skilled in the art will readily appreciate that the present
invention is well adapted to carry out the objects and obtain the ends and
advantages mentioned, as well as those inherent therein. The outwardly-
directed primers, along with the methods and procedures described herein
are presently representative of preferred embodiments, are exemplary, and
not intended as limitations on the scope of the invention. Changes therein
and other uses will occur to those skilled in the art which are
encompassed within the spirit of the invention or defined by the scope of
the claims.

(C) TELEX: 762829

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = G, C or

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GCCKGATGNC GRCGYNNNNN RCGYCTTATC MGGCCTAC 38

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GCCNGATGNC GNCGNNNNNN NCGNCTTATC NGGCCTAC 38

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GCCKGATGNC GRCGYNNNNN RCGYCTTATC MGGCCTAC 38

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
NNNNCGNCGN CATCNGGC 18

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine at position #1-3; N = A, G, C or T at position #10"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
NNNRCGYCGN CATCMGGC 18

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine at position #1-3; N = A, G, C or T at position #4, 7, 10, 15"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
NNNNCGNCGN CATCNGGC 18

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
NCGNCTTATC NGGCCTAC 18

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
RCGYCTTATC MGGCCTAC 18

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
NCGNCTTATC NGGCCTAC 18

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GNCATCNGGC 10

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GNCATCMGGC 10



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
GNCATCNGGC 10

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
NCGNCATCNG GC 12

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
YCGNCATCMG GC 12

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
NCGNCATCNG GC 12

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
NCGNNNNNNN CGNCGNCATC NGGC 24

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
RCGYNNNNNR CGYCGNCATC MGGC 24

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine at position #5-9; N = A, G, C or T at position #1, 4, 10, 13, 16, 21"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
NCGNNNNNNN CGNCGNCATC NGGC 24

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
ATAAGNCGNN NNNNNCGNCG NCATCNGGC 29

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
ATAAGRCGYN NNNNRCGYCG NCATCMGGC 29

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine at position #10-14; N = A, G, C or T at position #6, 9, 15, 18, 21, 26"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
ATAAGNCGNN NNNNNCGNCG NCATCNGGC 29






(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
TCNGGCCTAC 10

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
TCMGGCCTAC 10



(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
TCNGGCCTAC 10



(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
TATCNGGCCT AC 12

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
TATCMGGCCT AC 12

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
TATCNGGCCT AC 12

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
NNNNNNNCGN CTTATCNGGC CTAC 24

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
YNNNNNRCGY CTTATCMGGC CTAC 24

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine at position #2-6; N = A, G, C or T at position #1, 7, 10, 17"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
NNNNNNNCGN CTTATCNGGC CTAC 24

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
CGNCGNNNNN NNCGNCTTAT CNGGCCTAC 29

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
CGRCGYNNNN NRCGYCTTAT CMGGCCTAC 29



(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine at position #7-11; N = A, G, C or T at position #3, 6, 12, 15, 22"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
CGNCGNNNNN NNCGNCTTAT CNGGCCTAC 29

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
NNNNACGCCG CATCCGGC 18

(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
TCGGCTTATC GGGCCTKC 18

(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 126 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
..X.CHC«« .TH.™ ™c^..= CC.a=^S„K ...O^TVCC C.C..CC«T. eO 
S..^ST.«. T..CXO=b.T .»CH»C=H .==C^CCC. =«TaC^... 0^.=- 

GRGKAT 126

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 126 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
TATACCCAAA ATAATTCGAG TTGCAGCAAG GCGGCAAGTG AGTGAATCCC CAGGAGCTTA 60
CATAAGTAAG TGACTGCGGT GAGCGAACGC AGCCAACGCA GCTGCAGCTT GAAATATGAC 120
GGGTAT 126

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
ATGTAAGCTC CTGGGGATTC AC 22

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
ATNTANGCTC CNGGGNATTC AC 22

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
ATSTAWGCTC CYGGGHATTC AC 22

(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note=

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
ATNTANGCTC CNGGGNATTC AC 22

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
AAGTAAGTGA CTGGGGTGAG CG 22

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
AANTANGTGA CTGGGNTGAN C 21

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
AASTAWGTGA CTGGGRTGAR C 21

(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
AANTANGTGA CTGGGNTGAN C 21



(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T at all locations; and at location 8 the N can be omitted to form a 26 mer sequence"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
GTNCNGNNTT TTTGTTAATN CNCTATA 27

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
GTACCGGTTT TTGTTAATTC ACTATA 26



(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
ACAAAAACCG GTAC 14

(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
ACAAAAANCN GNAC 14

(2) INFORMATION FOR SEQ ID NO:50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
ACAAAAAYCR GKAC 14
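
By way of illustration only, a degenerate primer such as SEQ ID NO:50 can be expanded into the specific sequences it encodes. The following minimal Python sketch assumes the standard IUPAC ambiguity codes (R = A or G, Y = C or T, K = G or T, M = A or C, S = C or G, W = A or T) and treats N as any of the four bases; the function name `expand` is hypothetical, and inosine positions are not modelled.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (only the codes used in this listing).
# Assumption: N is treated as "any base"; inosine is not modelled here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "N": "ACGT",
}

def expand(primer: str) -> list[str]:
    """Return every specific sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer)
    return ["".join(combo) for combo in product(*choices)]

degenerate = "ACAAAAAYCRGKAC"  # SEQ ID NO:50, spacing removed
specific = "ACAAAAACCGGTAC"    # SEQ ID NO:48, spacing removed

variants = expand(degenerate)
print(len(variants))           # three two-fold positions: 2 * 2 * 2 = 8
print(specific in variants)    # True: SEQ ID NO:48 is one of the encoded sequences
```

The expansion grows multiplicatively with each degenerate position, which is one reason a sketch like this is practical only for short primers with few ambiguity codes.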

(2) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
ACAAAAANCN GNAC 14

(2) INFORMATION FOR SEQ ID NO:52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
GTTAATTCAC TATA 14

(2) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = Inosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
GTTAATNCNC TATA 14

(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
GTTAATYCHC TATA 14

(2) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(ix) FEATURE:
(A) OTHER INFORMATION: /note= "N = A, G, C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
GTTAATNCNC TATA 14

(2) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
CGAGCTGTCC CAGTCCGC 18

(2) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
GCGGACTGGG ACAGCTCG 18

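
By way of illustration only, the relationship between the two Drrep primers can be checked computationally: SEQ ID NO:56 and SEQ ID NO:57 appear to be reverse complements of one another, and SEQ ID NO:57 also occurs within SEQ ID NO:60. The minimal Python sketch below tests the first observation; the sequences as transcribed in the code (spacing removed) and the function name `reverse_complement` are assumptions of this sketch rather than part of the listing.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse to read 5' to 3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]

seq56 = "CGAGCTGTCCCAGTCCGC"  # SEQ ID NO:56 as transcribed for this sketch
seq57 = "GCGGACTGGGACAGCTCG"  # SEQ ID NO:57 as transcribed for this sketch

print(reverse_complement(seq56) == seq57)  # True
```

Because the two primers anneal to opposite strands of the same repeat, each one extends outwardly from the repeat in its own direction.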
(2) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
CAGCCATGAA CAACTCGTGG CG 22

(2) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
TGCTTTGCGC AGGGAAGATT CC 22

(2) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
YTAGAGYATT TGMCAAAAAG ACGCAACGTC TTTTTGGCGR GCGGACTGGG ACAGCTCGMA 60
GAGRGCGAGT GCAAAACACK GAGCAGGGCG 90

CLAIMS 

What we claim is: 

1. A method for identifying a strain of bacteria, comprising the steps 
of: 

amplifying DNA between repetitive sequences in a 
sample containing said bacteria by adding a pair of 
outwardly-directed primers to said sample, said primers 
capable of hybridizing to repetitive DNA sequences in the 
bacterial DNA and extending outwardly from one 
hybridizable repetitive sequence to another hybridizable 
repetitive sequence; 

separating the extension products generated in the 
amplification step by size; and 

determining the specific strain of bacteria by 
measuring the pattern of sized extension products. 

2. The method of claim 1, wherein the hybridizable repetitive sequence 
is selected from the group consisting of repetitive extragenic 
palindromic elements (REP), enterobacterial repetitive intergenic 
consensus sequence (ERIC), Neisseria repetitive extragenic 
elements (Ngrep), Deinococcus repetitive extragenic elements 
(Drrep) and any combination thereof. 

3. The method of claim 1, wherein the primers are between about 10 
mer and 29 mer. 

4. The method of claim 1, wherein the primers are 15 mer to 25 mer. 

5. The method of claim 1, wherein the repetitive sequence is REP and 
one member of the pair of primers is selected from the group 
consisting of SEQ. ID. NOS. 4, 5, 6, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
19, 20, 21 and 34; and the other member of the pair is selected 
from the group consisting of SEQ. ID. NOS. 7, 8, 9, 22, 23, 24, 25, 
26, 27, 28, 29, 30, 31, 32, 33 and 35. 






6. The method of claim 1, wherein the repetitive sequence is ERIC 
and one member of the pair of primers is selected from the group 
consisting of SEQ. ID. NOS. 38, 39, 40 and 41; and the other 
member of the pair is selected from the group consisting of SEQ. 
ID. NOS. 42, 43, 44 and 45. 

7. The method of claim 1, wherein the repetitive sequence is Ngrep 
and one member of the pair of primers is selected from the group 
consisting of SEQ. ID. NOS. 48, 49, 50 and 51; and the other 
member of the pair is selected from the group consisting of SEQ. 
ID. NOS. 52, 53, 54 and 55. 

8. The method of claim 1, wherein the primers are SEQ. ID. NO. 4 
and SEQ. ID. NO. 7. 

9. The method of claim 1, wherein the repetitive sequence is ERIC 
and the primers are SEQ. ID. NO. 38 and SEQ. ID. NO. 42. 

10. The method of claim 1, wherein the repetitive sequence is Ngrep 
and the primers are SEQ. ID. NO. 48 and SEQ. ID. NO. 52. 

11. The method of claim 1, wherein the repetitive sequence is Drrep 
and the primer is SEQ. ID. NO. 56 or SEQ. ID. NO. 57. 

12. The method of claim 1, wherein a plurality of pairs are added and 
wherein each pair is to a different repetitive sequence. 

13. The method of claim 12, wherein primers are selected from the 
group consisting of REP, ERIC, Ngrep, Drrep and any combination 
thereof. 

14. The method of claim 12, wherein the primers are SEQ. ID. NOS. 
4, 7, 38 and 42. 

15. The method of claim 1, wherein the DNA is extracted from the 
bacteria prior to adding the primers. 

16. The method of claim 1, wherein the separating step includes gel 
electrophoresis of the extension products. 

17. The method of claim 16, wherein the extension products are stained 
with ethidium bromide. 

18. The method of claim 1, wherein the primers are labelled and the 
determining step includes measuring the pattern of labelling. 

19. The method of claim 1, wherein the separation step includes 
chromatography of the extension products. 

20. The method of claim 19, wherein the primers are labelled. 

21. The method of claim 19, wherein the label is a fluorescer. 

22. The method of claim 1, wherein the sample contains a plurality of 
bacteria and wherein each specific strain of bacteria is distinguished 
by its unique size pattern of extension products. 

23. The method of claim 22, wherein the sample is selected from the 
group consisting of blood, urine, spinal fluid, tissue, vaginal swab, 
stool, amniotic fluid, and buccal mouthwash. 

24. The method of claim 23, wherein the sample is from a human 
subject. 

25. The method of claim 23, wherein the test sample is from an animal 
subject. 

26. The method of claim 22, wherein the test sample is an agriculture 
sample. 

27. The method of claim 22, wherein the test sample is food. 

28. The method of claim 22, wherein the test sample is an 
environmental sample. 

29. The method of claim 22, wherein the test sample is a horticulture 
sample. 

30. The method of claim 1 for diagnosing bacterial disease, wherein the 
sample is collected from a subject suspected of having a bacterial 
disease. 

31. The method of claim 30, wherein the subject is a human. 

32. The method of claim 30, wherein the subject is an animal. 

33. The method of claim 30, wherein the subject is a plant. 

34. The method of claim 1 for monitoring bacterial contamination in an 
environment, wherein the sample is collected from an environmental 
source suspected of being contaminated. 

35. The method of claim 34, wherein the environmental source is a 
liquid. 

36. The method of claim 34, wherein the environmental source is 
sludge. 

37. The method of claim 34, wherein the environmental source is 
sewage. 

38. The method of claim 34, wherein the environmental source is a 
treatment plant. 

39. The method of claim 34, wherein the environmental source is soil. 

40. The method of claim 1 for monitoring bacterial contamination of 
food, wherein the sample is collected from food suspected of being 
contaminated. 

41. The method of claim 40, wherein the food is infant formula. 

42. The method of claim 40, wherein the food is sea food. 

43. The method of claim 40, wherein the food is fresh produce. 

44. The method of claim 40, wherein the food is processed food. 

45. The method of claim 1 for monitoring a bacterial population at a 
bioremediation site, wherein the sample is collected from said site. 

46. The method of claim 45, wherein the sample is soil. 

47. The method of claim 45, wherein the sample is liquid. 

48. The method of claim 45, wherein the sample is sludge. 

49. The method of claim 45, wherein the sample is from the bacteria 
which is to be added to the site. 

50. The method of claim 1 for monitoring a horticulture sample, 
wherein the sample is collected from a horticulture source to be 
tested. 

51. The method of claim 1 for monitoring an agriculture sample, 
wherein the sample is collected from an agriculture source to be 
tested. 

52. The method of claim 1 for monitoring bacterial additions to an 
agricultural environment, wherein the sample is collected from an 
agriculture source to be tested. 

53. The method of claim 52, wherein the sample is a liquid. 

54. The method of claim 52, wherein the sample is soil. 

55. The method of claim 52, wherein the sample is from a plant. 

56. The method of claim 52, wherein the sample is from an animal. 

57. The method of claim 1 for monitoring manufacturing processes for 
bacteria, wherein the sample is collected from the process to be 
tested. 

58. The method of claim 57, wherein the sample is selected from the 
group consisting of drug manufacturing processes, fermentation 
processes, microorganism-aided synthesis processes, chemical 
manufacturing processes and food manufacturing processes. 

59. The method of claim 1 for quality assurance/quality control of 
laboratory tests involving microbiological assays, wherein the 
sample is collected from the bacterial stock to be tested. 

60. The method of claim 1 for tracing outbreaks of bacterial infections, 
wherein the sample is collected from an organism to be tested. 

61. A method for genome mapping, comprising the steps of: 

fractionating the genome; 

cloning the fractionated genome into a vector; 

testing the cloned vectors by amplifying bacterial DNA in the 
clones by adding a pair of outwardly-directed primers to the test 
sample, said primers capable of hybridizing to repetitive DNA 
sequences in the bacterial DNA and extending outwardly from the 
hybridizable repetitive sequence to another hybridizable repetitive 
sequence; 

separating the extension products of the amplification step 
by size; 

measuring the pattern of extension products; and 

reconstructing the genome from the overlapping patterns. 
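
The reconstruction step of claim 61 depends on recognizing which clones overlap. As an illustrative sketch only, and not the procedure of the specification, clones may be treated as overlapping when their amplification patterns share a minimum number of bands of similar size, and overlapping clones may then be grouped into contiguous sets. The function names, the 5% size tolerance, the two-band threshold and the clone data below are all assumptions made for the example. 

```python
from itertools import combinations


def n_shared(a, b, tol=0.05):
    """Count bands in `a` that pair with a distinct band in `b`.

    Two bands pair when their sizes (bp) differ by at most `tol`
    (a relative tolerance; an assumed value) of the larger size.
    Pairing is greedy over the sorted sizes.
    """
    remaining = sorted(b)
    count = 0
    for size in sorted(a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tol * max(size, other):
                count += 1
                del remaining[i]
                break
    return count


def contigs(clones, min_shared=2, tol=0.05):
    """Group clones whose band patterns overlap, directly or via other clones.

    `clones` maps a clone name to its list of band sizes in bp.
    `min_shared` is an assumed threshold for calling two clones overlapping.
    """
    neighbours = {name: set() for name in clones}
    for x, y in combinations(clones, 2):
        if n_shared(clones[x], clones[y], tol) >= min_shared:
            neighbours[x].add(y)
            neighbours[y].add(x)

    seen, groups = set(), []
    for start in clones:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:  # depth-first walk over overlapping clones
            node = stack.pop()
            group.append(node)
            for nb in neighbours[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Hypothetical band sizes, for illustration only.
    example = {
        "c1": [400, 900, 1500],
        "c2": [900, 1500, 2200],
        "c3": [1500, 2200, 3100],
        "c4": [650, 1250],
    }
    print(contigs(example))
```

In this sketch a clone joins a group if it overlaps any member of the group, so a chain of pairwise overlaps yields one contiguous set. Ordering the clones within a set would require a further step that is not shown. 
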
62. A method for automated identification of a strain of bacteria 
comprising the steps of: 

adding bacteria and outwardly-directed PCR primers to a 
test sample in an auto-PCR instrument, wherein said primers are 
capable of hybridizing to repetitive DNA sequences in the bacterial 
DNA and extending outwardly from the hybridizable repetitive 
sequence to another hybridizable repetitive sequence; 

transferring the extension products from the PCR assay and 
separating the extension products; 

measuring the sizing pattern of said separated extension 
products with a measuring means; and 

recognizing and identifying the sizing pattern by a computer 
means. 

63. The method of claim 62, wherein the measuring means is selected 
from the group consisting of a bar code reader, a laser reader, a 
digitizer, a photometer and a fluorescence reader. 

64. The method of claim 62, wherein the extension products are 
separated by chromatography or gel electrophoresis. 

65. The method of claim 62, wherein a sample from the PCR amplification is 
applied to a gel and electrophoresed; the size pattern is read by a 
measuring means; and the pattern is compared by the computer 
means with stored known bacterial patterns. 
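
The comparison of a measured size pattern with stored known bacterial patterns, as recited in claim 65, can be sketched as follows. This is a minimal illustration under stated assumptions and is not taken from the specification: bands are paired within an assumed 5% size tolerance, similarity is scored with the Dice coefficient (a choice made for this example), and the stored patterns and strain names are hypothetical. 

```python
def shared_bands(a, b, tol=0.05):
    """Count bands in `a` that pair with a distinct band in `b`.

    Bands pair when their sizes (bp) differ by at most `tol`
    (an assumed relative tolerance) of the larger size.
    """
    remaining = sorted(b)
    count = 0
    for size in sorted(a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tol * max(size, other):
                count += 1
                del remaining[i]
                break
    return count


def dice(a, b, tol=0.05):
    """Dice similarity: twice the shared bands over the total band count."""
    if not a and not b:
        return 1.0
    return 2 * shared_bands(a, b, tol) / (len(a) + len(b))


def identify(sample, library, tol=0.05, min_score=0.8):
    """Return (best matching stored name, score), or (None, score) below threshold.

    `library` maps a stored strain name to its list of band sizes in bp.
    `min_score` is an assumed cut-off, not a value from the source.
    """
    best_name, best_score = None, 0.0
    for name, pattern in library.items():
        score = dice(sample, pattern, tol)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_score:
        return None, best_score
    return best_name, best_score


if __name__ == "__main__":
    # Hypothetical stored patterns and measured sample, for illustration only.
    stored = {
        "strain_A": [300, 520, 1100, 2400],
        "strain_B": [310, 800, 1500],
    }
    print(identify([305, 515, 1120, 2380], stored))
```

A relative tolerance is used because sizing error on a gel generally grows with fragment length; a fixed tolerance in base pairs would be an equally valid assumption for a sketch of this kind. 
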
66. The method of claim 62, wherein the separated extension products are 
stained with ethidium bromide before reading. 

67. The method of claim 62, wherein the primers are labelled. 

68. The method of claim 62, wherein the primers are labelled with a fluorescer. 

69. A method of identifying a strain of bacteria in a test sample, 
comprising the steps of: 

amplifying DNA in said bacteria by adding a plurality of 
pairs of outwardly-directed primers to said test sample, each pair 
of said primers capable of hybridizing to different repetitive DNA 
sequences in the bacterial DNA and each pair extending outwardly 
from its hybridizable repetitive sequence to another of its 
hybridizable repetitive sequences and wherein each pair is 
differentially labelled; 

separating the extension products generated in the 
amplification step by size; and 

determining the specific strain of bacteria by measuring the 
pattern of sized extension products for each pair. 
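
Because each primer pair in claim 69 carries a different label, the sized products can be sorted into one pattern per label before comparison. The following is a hedged sketch of that bookkeeping, not the method of the specification; the label names ("dye_1", "dye_2"), the reference strains, the band sizes and the 5% tolerance are hypothetical. 

```python
from collections import defaultdict


def patterns_by_label(detections):
    """Group (label, size_bp) detections into one sorted size list per label."""
    grouped = defaultdict(list)
    for label, size_bp in detections:
        grouped[label].append(size_bp)
    return {label: sorted(sizes) for label, sizes in grouped.items()}


def same_pattern(a, b, tol=0.05):
    """True when both patterns have equal band counts and every band,
    taken in size order, agrees within the assumed relative tolerance."""
    if len(a) != len(b):
        return False
    return all(
        abs(x - y) <= tol * max(x, y) for x, y in zip(sorted(a), sorted(b))
    )


def matching_strains(detections, references, tol=0.05):
    """Return reference strains whose pattern agrees for every label.

    `references` maps a strain name to {label: [band sizes in bp]}.
    """
    observed = patterns_by_label(detections)
    hits = []
    for strain, per_label in references.items():
        if per_label.keys() != observed.keys():
            continue  # a label is missing or unexpected
        if all(same_pattern(observed[k], per_label[k], tol) for k in per_label):
            hits.append(strain)
    return hits


if __name__ == "__main__":
    # Hypothetical readings and reference patterns, for illustration only.
    readings = [("dye_1", 410), ("dye_2", 1210), ("dye_1", 985), ("dye_2", 300)]
    refs = {
        "strain_X": {"dye_1": [400, 1000], "dye_2": [300, 1200]},
        "strain_Y": {"dye_1": [400, 1000], "dye_2": [700, 1200]},
    }
    print(matching_strains(readings, refs))
```

In the example the two reference strains share a pattern for the first label and are separated only by the second, which is the situation in which several differentially labelled primer pairs add discriminating power. 
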

70. The method of claim 69, wherein the labels are fluorescers. 

71. The method of claim 69, wherein the separation is by gel electrophoresis, 
capillary electrophoresis, mass spectrometry or chromatography. 

72. The method of claim 69, wherein the primer pairs are selected from 
the group consisting of REP, ERIC, Ngrep, Drrep and any 
combination thereof. 

73. A kit for determining the identity of strains of bacteria, comprising 
a container having outwardly-directed PCR primer pairs to 
repetitive sequences in bacteria. 

74. The kit of claim 73, wherein the PCR primer pairs are selected 
from the group consisting of SEQ. ID. NOS. 4, 5, 6, 7, 8, 9, 10, 11, 
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 38, 39, 40, 41, 42, 43, 44, 45, 48, 49, 50, 51, 
52, 53, 54, 55, 56, 57 and any combination thereof. 

75. As a composition of matter the sequences selected from the group 
consisting of SEQ. ID. NOS. 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 and 
33. 

76. As a composition of matter the sequences selected from the group 
consisting of SEQ. ID. NOS. 38, 39, 40, 41, 42, 43, 44 and 45. 

77. As a composition of matter the sequences selected from the group 
consisting of SEQ. ID. NOS. 48, 49, 50, 51, 52, 53, 54 and 55. 

78. As a composition of matter the sequences selected from the group 
consisting of SEQ. ID. NOS. 56 and 57. 

79. As a composition of matter sequences selected from the group 
consisting of SEQ. ID. NOS. 4 and 7. 

80. As a composition of matter sequences selected from the group 
consisting of SEQ. ID. NOS. 38 and 42. 

81. As a composition of matter sequences selected from the group 
consisting of SEQ. ID. NOS. 48 and 52. 

82. A machine for identifying a strain of bacteria comprising: 

an automated PCR amplifying means; 

a separation means; 

a sampling means for removing the extension products from 
the PCR means and transferring them to the separation means; 

a reading means for measuring patterns of extension 
products after separation on the separation means; and 

a computer means for recording the results of the reading 
means and for outputting the pattern and identifying the strain of 
bacteria. 

83. The machine of claim 82, wherein the separation means is selected 
from the group consisting of a gel electrophoresis apparatus, 
capillary electrophoresis apparatus, mass spectrometer and a 
chromatographic apparatus. 




84. The method of claim 82, wherein the reading means is selected 
from the group consisting of a digitizer, a bar code reader, a laser 
detector, a fluorescence detector, a photometer and a radioactive 
detector. 

85. A machine for identifying a strain of bacteria comprising: 
a thermal cycler; 

a separator selected from the group consisting of a gel 
electrophoresis, capillary electrophoresis and chromatographic 
apparatus; 

a detector selected from the group consisting of a digitizer, 
a fluorescence detector and a photometer; 

a robotic apparatus for handling samples and moving them 
from one location to another; and 

a computer means for regulating the operation of the 
machine, for collecting and storing the information and for 
comparing the genomic fingerprints to identify the bacterial strain; 
wherein the thermocycler, detector and separator are attached to 
the computer and integrated into a cabinet. 

86. The method of claim 1, wherein the blood sample is tested for 
bacterial contamination and the sample is stored blood or blood 
used for transfusions. 

87. The method of claim 86, wherein the sample is tested for the 
specific bacterial species Yersinia enterocolitica. 

88. A method for identifying a strain of bacteria, comprising the steps 
of: 

amplifying DNA between repetitive sequences in a sample 
containing said bacteria by adding a primer to said sample, wherein 
said primer is hybridizable to the repetitive sequence in either of 
the complementary strands of the DNA and wherein the hybridized 
primer extends from one repetitive sequence across non-repetitive 
DNA to another repetitive sequence and wherein said extension 
product is hybridizable by the primer for generations of further 
extension products; 

separating the extension products by size; and 

determining the specific strain of bacteria by measuring the 
pattern of sized extension products. 

Figure 4. Gel image (not reproduced). Lane labels as printed: 1 kb ladder; 
REPALL-D; REP2-I; REP1R-I, REP2-I; REP1R-D, REP2-D; REP1R-Dt; 
neg. control; ERIC2, ERIC1R; neg. control; 1 kb ladder. 

Figure 6. Gel image (not reproduced). Lane labels as printed: 1 kb ladder; 
E. coli W3110; E. coli 2955; Shigella sp. 170; S. typhimurium LT-2; 
Salmonella sp. 4077; Salmonella sp. 4340; Salmonella sp. 4359; 
C. diversus 1216m123; C. diversus 4036; K. pneumoniae (strain number 
not legible); K. pneumoniae 4732; E. sakazakii 4585; P. aeruginosa 4938; 
P. aeruginosa 5014; neg. control; 1 kb ladder. 

Figure 7. Gel image (not reproduced). Lane labels as printed: 1 kb ladder; 
E. coli HB101; S. typhi 2304; Salmonella sp. 4077; Salmonella sp. 4340; 
Salmonella sp. 4359; C. diversus 1216m123; C. diversus 4036; 
K. pneumoniae 4732; E. sakazakii 4584; E. sakazakii 4585; 
P. aeruginosa 4998; P. aeruginosa 5014; neg. control; 1 kb ladder. 

Figure (number not legible). Key to 45 numbered samples (image not 
reproduced). Legible entries: 1. Rhodobacter; 2. Rhizobium meliloti; 
3. Neisseria gonorrheae; 6. E. coli HB101; 7. E. coli W3110; 
12. Serratia marcescens; 13. Proteus; 15. Xanthomonas manihotis; 
16. Vibrio vulnificus; 17. Myxococcus; 18. Arthrobacter; 
21. Mycobacterium aurum; 22. Bacillus subtilis; 23. Listeria; 
24. Staphylococcus aureus; 25. Streptococcus; 26. Group B; 
27. Caryophanon; 28. Mycoplasma; 29. Anabaena; 30. Borrelia burgdorferi; 
32. Treponema; 33. Bacteroides fragilis; 34. Fusobacterium nucleatum; 
36. Flavobacterium okeanokoites; 37. Deinococcus radiophilus; 
39. Thermus thermophilus; 40. Herpetosiphon giganteus; 
42. Saccharomyces cerevisiae; 43. Schizosaccharomyces pombe; 
45. Homo sapiens. Entries whose numbers are not legible: Citrobacter; 
otitidiscaviarum; pallidum; meningosepticum; Thermus aquaticus; halobium. 

Figure 9A. Gel image (not reproduced). Lane labels as printed: 1 kb ladder; 
S. aureus; Group B Streptococcus; C. latum; B. burgdorferi; T. pallidum; 
T. phagedenis; B. fragilis; F. nucleatum; F. meningosepticum; 
D. radiophilus; T. aquaticus; T. thermophilus; H. giganteus; H. halobium; 
S. pombe; C. parapsilosis; neg. control; 1 kb ladder. 

Figure (number not legible). Gel image (not reproduced). Lane labels as 
printed: 1 kb ladder; N. meningitidis; Sphaerotilus sp.; E. coli HB101; 
E. coli W3110; K. pneumoniae; S. marcescens; P. vulgaris; P. aeruginosa; 
X. manihotis; V. vulnificus; M. xanthus; N. otitidiscaviarum; S. albus G; 
M. aurum; B. subtilis DB-2; L. monocytogenes; 1 kb ladder. 

Figure 10A. Key to 45 numbered samples (image not reproduced): 

1. Rhodobacter sphaeroides 
2. Rhizobium meliloti 
3. Neisseria gonorrheae 
4. (genus not legible) meningitidis 
5. Sphaerotilus sp. 
6. E. coli HB101 
7. E. coli W3110 
8. (genus not legible) sp. 
9. Citrobacter diversus 
10. Klebsiella pneumoniae 
11. Enterobacter sakazakii 
12. Serratia marcescens 
13. Proteus vulgaris 
14. Pseudomonas aeruginosa 
15. Xanthomonas manihotis 
16. Vibrio vulnificus 
17. Myxococcus xanthus 
18. Arthrobacter luteus 
19. Nocardia otitidiscaviarum 
20. Streptomyces albus G 
21. Mycobacterium aurum 
22. Bacillus subtilis 
23. Listeria monocytogenes 
24. Staphylococcus aureus 
25. Streptococcus pneumoniae 
26. Group B Streptococcus 
27. Caryophanon latum 
28. Mycoplasma pneumoniae 
29. Anabaena sp. 
30. Borrelia burgdorferi 
31. Treponema pallidum 
32. Treponema (species not legible) 
33. Bacteroides fragilis 
34. Fusobacterium nucleatum 
35. Flavobacterium meningosepticum 
36. Flavobacterium okeanokoites 
37. Deinococcus radiophilus 
38. Thermus aquaticus 
39. Thermus thermophilus 
40. Herpetosiphon giganteus 
41. Halobacterium halobium 
42. Saccharomyces cerevisiae 
43. Schizosaccharomyces pombe 
44. Candida parapsilosis 
45. Homo sapiens 

Figure 11A. Gel image (not reproduced). Lane labels as printed: 1 kb ladder; 
R. sphaeroides; N. gonorrheae; Sphaerotilus sp.; E. coli HB101; 
E. coli W3110; Salmonella sp.; C. diversus; K. pneumoniae; E. sakazakii; 
S. marcescens; P. vulgaris; P. aeruginosa; X. manihotis; V. vulnificus; 
M. xanthus; N. otitidiscaviarum; S. albus G; M. aurum; B. subtilis DB-2; 
L. monocytogenes; 1 kb ladder. 

Figure 11B. Gel image (not reproduced). Lane labels as printed: 
S. pneumoniae; M. pneumoniae; T. phagedenis; F. nucleatum; 
F. meningosepticum; D. radiophilus; T. aquaticus; T. thermophilus; 
H. giganteus; H. halobium; S. cerevisiae; C. parapsilosis; neg. control; 
1 kb ladder. 






Figure 12. Gel image (not reproduced). Size markers: 3054 bp, 2036 bp, 
1018 bp and 506, 517 bp. Band-size annotations: 1500 bp and 500 bp. 
Labels as printed: Ngrep2, Ngrep1R; PCR product; Ngrep2, Ngrep1R. 